Goto

Collaborating Authors

 client-side model




Communication and Computation Efficient Split Federated Learning in O-RAN

arXiv.org Artificial Intelligence

The hierarchical architecture of Open Radio Access Network (O-RAN) has enabled a new Federated Learning (FL) paradigm that trains models using data from non- and near-real-time (near-RT) Radio Intelligent Controllers (RICs). However, the ever-increasing model size leads to longer training time, jeopardizing the deadline requirements for both non-RT and near-RT RICs. To address this issue, split federated learning (SFL) offers an approach by offloading partial model layers from near-RT-RIC to high-performance non-RT-RIC. Nonetheless, its deployment presents two challenges: (i) Frequent data/gradient transfers between near-RT-RIC and non-RT-RIC in SFL incur significant communication cost in O-RAN. (ii) Proper allocation of computational and communication resources in O-RAN is vital to satisfying the deadline and affects SFL convergence. Therefore, we propose SplitMe, an SFL framework that exploits mutual learning to alternately and independently train the near-RT-RIC's model and the non-RT-RIC's inverse model, eliminating frequent transfers. The ''inverse'' of the inverse model is derived via a zeroth-order technique to integrate the final model. Then, we solve a joint optimization problem for SplitMe to minimize overall resource costs with deadline-aware selection of near-RT-RICs and adaptive local updates. Our numerical results demonstrate that SplitMe remarkably outperforms FL frameworks like SFL, FedAvg and O-RANFed regarding costs and convergence.


Federated Split Learning with Improved Communication and Storage Efficiency

arXiv.org Artificial Intelligence

--Federated learning (FL) is one of the popular distributed machine learning (ML) solutions but incurs significant communication and computation costs at edge devices. Federated split learning (FSL) can train sub-models in parallel and reduce the computational burden of edge devices by splitting the model architecture. However, it still requires a high communication overhead due to transmitting the smashed data and gradients between clients and the server in every global round. Furthermore, the server must maintain separate partial models for every client, leading to a significant storage requirement. T o address these challenges, this paper proposes a novel communication and storage efficient federated split learning method, termed CSE-FSL, which utilizes an auxiliary network to locally update the weights of the clients while keeping a single model at the server, hence avoiding frequent transmissions of gradients from the server and greatly reducing the storage requirement of the server . Additionally, a new model update method of transmitting the smashed data in selected epochs can reduce the amount of smashed data sent from the clients. We provide a theoretical analysis of CSE-FSL, rigorously guaranteeing its convergence under non-convex loss functions. The extensive experimental results further indicate that CSE-FSL achieves a significant communication reduction over existing FSL solutions using real-world FL tasks. S an emerging distributed machine learning (ML) paradigm, federated learning (FL) [2] enables distributed clients to collaboratively train ML models without uploading their sensitive data to a central server. While FL addresses privacy concerns by keeping data localized, most existing FL approaches assume that clients possess sufficient computational and storage resources to perform local updates on large (potentially deep) models. However, this assumption breaks down in scenarios where clients, such as mobile and Internet-of-Things (IoT) devices, are resource-constrained. Consequently, these clients struggle to handle the heavy computational and storage demands of training deep ML models, making FL impractical in such settings. To address this issue, split learning (SL) [3]-[5] is proposed.


FSL-SAGE: Accelerating Federated Split Learning via Smashed Activation Gradient Estimation

arXiv.org Artificial Intelligence

Collaborative training methods like Federated Learning (FL) and Split Learning (SL) enable distributed machine learning without sharing raw data. However, FL assumes clients can train entire models, which is infeasible for large-scale models. In contrast, while SL alleviates the client memory constraint in FL by offloading most training to the server, it increases network latency due to its sequential nature. Other methods address the conundrum by using local loss functions for parallel client-side training to improve efficiency, but they lack server feedback and potentially suffer poor accuracy. We propose FSL-SAGE (Federated Split Learning via Smashed Activation Gradient Estimation), a new federated split learning algorithm that estimates server-side gradient feedback via auxiliary models. These auxiliary models periodically adapt to emulate server behavior on local datasets. We show that FSL-SAGE achieves a convergence rate of $\mathcal{O}(1/\sqrt{T})$, where $T$ is the number of communication rounds. This result matches FedAvg, while significantly reducing communication costs and client memory requirements. Our empirical results also verify that it outperforms existing state-of-the-art FSL methods, offering both communication efficiency and accuracy.


Collaborative Split Federated Learning with Parallel Training and Aggregation

arXiv.org Artificial Intelligence

Federated learning (FL) operates based on model exchanges between the server and the clients, and it suffers from significant client-side computation and communication burden. Split federated learning (SFL) arises a promising solution by splitting the model into two parts, that are trained sequentially: the clients train the first part of the model (client-side model) and transmit it to the server that trains the second (server-side model). Existing SFL schemes though still exhibit long training delays and significant communication overhead, especially when clients of different computing capability participate. Thus, we propose Collaborative-Split Federated Learning(C-SFL), a novel scheme that splits the model into three parts, namely the model parts trained at the computationally weak clients, the ones trained at the computationally strong clients, and the ones at the server. Unlike existing works, C-SFL enables parallel training and aggregation of model's parts at the clients and at the server, resulting in reduced training delays and commmunication overhead while improving the model's accuracy. Experiments verify the multiple gains of C-SFL against the existing schemes. Keywords: Split learning Distributed AI systems 1 Introduction The wide deployment of devices that gather vast amounts of data, along with the stringent data privacy requirements of numerous applications (such as healthcare and natural language processing applications), drives the adoption of distributed learning schemes, like Federated Learning (FL) [9]. In FL, clients update their local model for multiple local epochs and send it to a server, which aggregates all local models and transmits the aggregated model back to the clients for the next training round. This synchronous process is repeated for multiple rounds.


Federated Split Learning with Model Pruning and Gradient Quantization in Wireless Networks

arXiv.org Artificial Intelligence

As a paradigm of distributed machine learning, federated learning typically requires all edge devices to train a complete model locally. However, with the increasing scale of artificial intelligence models, the limited resources on edge devices often become a bottleneck for efficient fine-tuning. To address this challenge, federated split learning (FedSL) implements collaborative training across the edge devices and the server through model splitting. In this paper, we propose a lightweight FedSL scheme, that further alleviates the training burden on resource-constrained edge devices by pruning the client-side model dynamicly and using quantized gradient updates to reduce computation overhead. Additionally, we apply random dropout to the activation values at the split layer to reduce communication overhead. We conduct theoretical analysis to quantify the convergence performance of the proposed scheme. Finally, simulation results verify the effectiveness and advantages of the proposed lightweight FedSL in wireless network environments.


Federated Split Learning for Human Activity Recognition with Differential Privacy

arXiv.org Artificial Intelligence

This paper proposes a novel intelligent human activity recognition (HAR) framework based on a new design of Federated Split Learning (FSL) with Differential Privacy (DP) over edge networks. Our FSL-DP framework leverages both accelerometer and gyroscope data, achieving significant improvements in HAR accuracy. The evaluation includes a detailed comparison between traditional Federated Learning (FL) and our FSL framework, showing that the FSL framework outperforms FL models in both accuracy and loss metrics. Additionally, we examine the privacy-performance trade-off under different data settings in the DP mechanism, highlighting the balance between privacy guarantees and model accuracy. The results also indicate that our FSL framework achieves faster communication times per training round compared to traditional FL, further emphasizing its efficiency and effectiveness. This work provides valuable insight and a novel framework which was tested on a real-life dataset.


Personalized Hierarchical Split Federated Learning in Wireless Networks

arXiv.org Artificial Intelligence

Extreme resource constraints make large-scale machine learning (ML) with distributed clients challenging in wireless networks. On the one hand, large-scale ML requires massive information exchange between clients and server(s). On the other hand, these clients have limited battery and computation powers that are often dedicated to operational computations. Split federated learning (SFL) is emerging as a potential solution to mitigate these challenges, by splitting the ML model into client-side and server-side model blocks, where only the client-side block is trained on the client device. However, practical applications require personalized models that are suitable for the client's personal task. Motivated by this, we propose a personalized hierarchical split federated learning (PHSFL) algorithm that is specially designed to achieve better personalization performance. More specially, owing to the fact that regardless of the severity of the statistical data distributions across the clients, many of the features have similar attributes, we only train the body part of the federated learning (FL) model while keeping the (randomly initialized) classifier frozen during the training phase. We first perform extensive theoretical analysis to understand the impact of model splitting and hierarchical model aggregations on the global model. Once the global model is trained, we fine-tune each client classifier to obtain the personalized models. Our empirical findings suggest that while the globally trained model with the untrained classifier performs quite similarly to other existing solutions, the fine-tuned models show significantly improved personalized performance.


GAS: Generative Activation-Aided Asynchronous Split Federated Learning

arXiv.org Artificial Intelligence

Split Federated Learning (SFL) splits and collaboratively trains a shared model between clients and server, where clients transmit activations and client-side models to server for updates. Recent SFL studies assume synchronous transmission of activations and client-side models from clients to server. However, due to significant variations in computational and communication capabilities among clients, activations and client-side models arrive at server asynchronously. The delay caused by asynchrony significantly degrades the performance of SFL. To address this issue, we consider an asynchronous SFL framework, where an activation buffer and a model buffer are embedded on the server to manage the asynchronously transmitted activations and client-side models, respectively. Furthermore, as asynchronous activation transmissions cause the buffer to frequently receive activations from resource-rich clients, leading to biased updates of the server-side model, we propose Generative activations-aided Asynchronous SFL (GAS). In GAS, the server maintains an activation distribution for each label based on received activations and generates activations from these distributions according to the degree of bias. These generative activations are then used to assist in updating the server-side model, ensuring more accurate updates. We derive a tighter convergence bound, and our experiments demonstrate the effectiveness of the proposed method.